Abstract

To verify that a health management system (HMS) performs as expected, a virtual system simulation 
capability, including interaction with the associated platform or vehicle, very likely will need to be developed. 
The rationale for developing this capability is discussed and includes the limited capability to seed faults into 
the actual target system due to the risk of potential damage to high value hardware. The capability envisioned 
would accurately reproduce the propagation of a fault or failure as observed by sensors located at strategic 
locations on and around the target system and would also accurately reproduce the control system and 
vehicle response. In this way, HMS operation can be exercised over a broad range of conditions to verify that 
it meets requirements for accurate, timely response to actual faults with adequate margin against false and 
missed detections. An overview is also presented of a real-time rocket propulsion health management system 
laboratory which is available for future rocket engine programs. The health management elements and 
approaches of this lab are directly applicable for future space systems. In this paper the various components 
are discussed and the general fault detection, diagnosis, isolation and response (FDIR) concept is
presented. Additionally, the complexities of V&V (Verification and Validation) for advanced algorithms and 
the simulation capabilities required to meet the changing state-of-the-art in HMS are discussed. 

Nomenclature 


HMC = Health Management Computer
HMS = Health Management System
NGLT = Next Generation Launch Technology
OPAD = Optical Plume Anomaly Detection
RPP = Rocketdyne Propulsion and Power
RTOS = Real-Time Operating System
RTM = Real-Time Model
RTS = Real-Time System
RTVMS = Real-Time Vibration Monitoring System
SBC = Single Board Computer
SSME = Space Shuttle Main Engine
V&V = Verification and Validation


I. HMS In Space Exploration 

Integrated health management has been identified as an enabling technology for the Space Exploration 
Initiative. Benefits of such a system include increased safety (mitigation action or warning of impending critical 
failure), reliability (keep the system operating through component failure if possible) and reduced operating costs 
(faster turnaround due to automated health check). These systems may be expected to continuously monitor system 
operation, verify that a system is ready to re-start after prolonged shutdown, assess the criticality of faults and
identify faulty components without requiring unnecessary disassembly in hostile environments. 


II. HMS Development 


Many modern algorithms require data for the purpose of
training, setting thresholds, and even architecture design. The latter 
stems from the fact that one must be able to establish whether a 
system design will meet requirements or if additional sensor 
coverage is needed. Figure 1 illustrates in highly simplified form a 
development path for a HMS. It begins with a failure mode 
analysis to identify the potential failure points and associated 
modes. Taking into account criticality, probability of occurrence, 
and ability to mitigate, a list of targeted faults is identified and 
requirements are established for detection and mitigation. At this 
point, sufficiently detailed simulations are developed with the 
ability to replicate the targeted fault modes. Process noise models 
and control system models are also required because of their 
impact on HMS design. We have found that the desirability of 
using simulations increases with the complexity of the system 
since it is difficult to predict through thought processes alone what 
the system will do in the presence of a fault. The next step is to 
select a sensor group that provides the most diagnostic coverage 
while remaining within constraints on, for example, bandwidth and 
weight. Preliminary detection, isolation, and prognostic algorithms 
are then developed and the system effectiveness is evaluated to 
determine if the original requirements are met. If not, additional 
sensor and algorithm development may be called for and a 
pushback on requirements or design may be needed. Finally, the 
HMS is tested thoroughly before being activated on the target 
platform. Not illustrated here are the continuing design updates 
that affect requirements, simulation models, and sensor selection. 



Figure 1. HMS Development Path 


A. Sensor Selection 

Sensor data availability and accuracy as well as the fidelity and speed of diagnostic/prognostic algorithms that 
utilize this data ultimately define the capabilities and limits of a HMS. There are many technical and logistical 
considerations that impact the choice of sensors for a health management system; however, the overarching objective
is to select sensors that enable system diagnostics/prognostics for decision support, risk reduction, and efficient 
operation. Selection of a sensor suite that optimizes HMS diagnostic capability measures is therefore advantageous, 
especially so if accomplished in the early stages of host system design. Realistic estimates of HMS capabilities and 
limits early in the design cycle facilitate the use of health considerations in host system design decisions and support 
cost effective HMS integration. 

A systematic sensor selection strategy applied early in the host system design cycle integrates and utilizes 
multiple inputs pertinent to HMS development. These would include the following: 1) definition of relevant system 
operating modes and transitions; 2) identification of critical and/or targeted failure modes and characterization of the 
risks associated with these modes; 3) definition of candidate sensor types, locations, and properties; 4) 
characterization of normal sensor signal fluctuation including both random and systematic components; and 5) 
definition of fault detection and isolation thresholds consistent with false alarm constraints. A sufficiently detailed 
fault simulation capability and preliminary diagnostic/prognostic algorithms are also needed for sensor selection to 
be a useful design phase process. Effective assessment of HMS diagnostic/prognostic capability for targeted fault 
modes requires realistic simulation of fault evolution and effects propagation to sensor locations and output. Fault 
simulation data for selected sensors can then be utilized by diagnostic/prognostic algorithms to assess system state. 
Systematic sensor selection is used to identify the sensor suite that optimizes performance of the
diagnostic/prognostic algorithms for targeted fault modes 2 . Performance is measured quantitatively in terms of risk 
reduction potential based on coverage, timeliness, and fidelity of targeted fault detection and discrimination. The 
same performance metric can be used to quantitatively assess the impact of the loss or addition of individual sensors 
as well as the removal or addition of targeted failure modes and the maturation of health management software. 
This places the sensor selection process in the design iteration loop - facilitating design decisions based on 
knowledge and system hardware/software evolution. 
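
As a minimal sketch of how such a selection loop might be organized, the following greedy heuristic picks sensors by the number of not-yet-covered fault modes they detect per unit weight. The sensor names, weights, coverage sets, and the weight budget are all hypothetical, and a simple coverage count stands in for the risk-reduction metric described above; this is not the selection method used in the cited work.

```python
def select_sensors(candidates, weight_budget):
    """Greedy selection: repeatedly add the sensor that covers the most
    not-yet-covered fault modes per unit weight, within the budget."""
    chosen, covered, used = [], set(), 0.0
    remaining = dict(candidates)
    while remaining:
        best, best_score = None, 0.0
        for name, (weight, faults) in remaining.items():
            gain = len(faults - covered)
            if gain == 0 or used + weight > weight_budget:
                continue
            score = gain / weight
            if score > best_score:
                best, best_score = name, score
        if best is None:
            break  # nothing left that adds coverage and fits the budget
        weight, faults = remaining.pop(best)
        chosen.append(best)
        covered |= faults
        used += weight
    return chosen, covered, used

# Hypothetical candidates: name -> (weight, detectable fault modes).
CANDIDATES = {
    "pump_discharge_pressure": (1.0, {"pump_efficiency_loss", "line_blockage"}),
    "turbine_inlet_temperature": (1.5, {"turbine_efficiency_loss", "injector_erosion"}),
    "shaft_speed": (0.5, {"pump_efficiency_loss", "turbine_efficiency_loss"}),
    "chamber_pressure": (1.0, {"injector_erosion", "line_blockage"}),
}

chosen, covered, used = select_sensors(CANDIDATES, weight_budget=2.5)
```

Rerunning the same function with a sensor removed from the candidate table, or a fault mode added, gives a quick estimate of how the achievable coverage changes, which is the kind of design-loop question the metric is meant to answer.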

B. HMS Diagnostic Algorithms 

The monitoring component of health management is primarily concerned with assessing the health state of 
system components and the integrated system during operations. A suite of diagnostic algorithms forms the logical 
core of the health monitoring system. Fault detection and isolation are the short time scale diagnostic functions 
provided by these algorithms. Condition monitoring and trending are long time scale diagnostic functions that, 
when combined with available life limit information, provide the foundation for prognostics. The experience base 
for related systems provides a foundation for development of both system simulation models and diagnostic 
algorithms. In the design stage, performance models are generally constructed to characterize normal system 
operation, including sensible conditions, for a variety of design options. As performance models mature with the 
system design, parameter values that describe normal component function and interaction also mature. Perturbation 
of these parameters can be used to characterize anomalous behavior by component, thereby providing a basis for 
controllable fault simulation. 
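
The idea of seeding a fault by perturbing a model parameter can be sketched as follows. The two-parameter steady-state model is a toy written for illustration only; its equations, parameter names, and values are assumptions and do not represent any actual engine performance model.

```python
# Hypothetical nominal component parameters.
NOMINAL = {"pump_efficiency": 0.80, "line_resistance": 2.0}

def predict_sensors(params, shaft_speed):
    """Toy steady-state model: map component parameters to sensor values."""
    pressure_rise = params["pump_efficiency"] * shaft_speed ** 2
    flow = (pressure_rise / params["line_resistance"]) ** 0.5
    return {"discharge_pressure": pressure_rise, "flow": flow}

def seed_fault(params, name, scale):
    """Return a copy of the parameter set with one parameter scaled."""
    faulted = dict(params)
    faulted[name] = params[name] * scale
    return faulted

healthy = predict_sensors(NOMINAL, shaft_speed=10.0)
degraded = predict_sensors(
    seed_fault(NOMINAL, "pump_efficiency", 0.9), shaft_speed=10.0
)
```

Because the fault is just a scale factor on a named parameter, its severity and the affected component can be varied freely from run to run.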


III. The Problem 

To be most effective, a HMS should be an integral part of the system design and should have sufficient 
sensitivity without inducing false detections or misidentification of a fault. Effectiveness of the solution also factors 
into cost/benefit analyses and risk management. But in order to effectively design and test a HMS, sufficiently 
detailed characterization of the targeted fault modes is required. Characterization includes propagation modes and 
rates, frequency of occurrence, critical limits, impact on system performance and system response among others. 
Depending on the target system, this information may be garnered from analysis of previous occurrences of the fault 
or it may be obtained from component or system testing, often to failure. However, for systems such as main 
propulsion, fault experience may be minimal to the benefit of the program but to the detriment of the HMS designer. 
In addition, what failure history does exist may be for a different design or engine cycle. Testing also tends to be 
impractical due to the high cost and associated risk to critical hardware. 

A. Limited Fault Database 

One major problem that HMS designers face is the lack of data with which to establish thresholds, train 
algorithms and perform verification and validation (V&V). Space hardware typically is produced in relatively small 
quantities and does not accumulate much runtime as compared with systems such as jet engines. For example, the 
Space Shuttle Main Engine (SSME) is required to operate for only just over 500 seconds per flight. The SSME
recently reached the one million second mark for total firing time, but this took 30 years to accumulate.
Failures that have occurred with the SSME were mainly during early testing. The SSME has undergone a number of
upgrades that essentially have created a new engine design. Failures have occurred and still do occur during system
testing. A failure will often result in a change in design or other measures that prevent the fault from recurring 1 . For the HMS
designer, this means that a failure mode is removed from consideration or, at a minimum, that the probability of 
occurrence has been reduced significantly. For HMS designers, then, the existing database may be unsuitable for 
three reasons. First, data that exists may not cover the entire targeted fault list and if it does, represents only a single 
data point. Second, the existing data is typically for a different system design. And finally, as noted above, the data 
may exist for a fault that has been designed out and is no longer relevant or has minimal impact. 


B. Impracticality of Seeding Faults 

To characterize a fault mode, one can seed the fault into the actual system and record the results. However, for 
complex systems, this is usually impractical. One problem is the cost. A typical hotfire test of a rocket engine can 
cost on the order of several hundred thousand dollars. A minimal test series for an HMS targeting 20 or so fault 
modes could easily end up costing millions of dollars for testing alone, not even considering the cost of “repairing”
the engine after each test. Also, modern test programs are run with very little developmental hardware. To place a
test article at risk to characterize a fault mode is not practical. Another problem is that it can be difficult to mimic a 
critical failure realistically. For example, a leak can be simulated by opening a valve but this is not representative of 
the actual failure. 

C. Multidisciplinary Nature of Envisioned HMS Systems 

Based on our experience, a future HMS can be expected to receive inputs from various sensor types. For 
example, rocket engines of the future may carry an HMS which simultaneously monitors internal pressures and 
temperatures, vibration levels of key components, sniffers and cameras monitoring the external environment and 
circuitry which monitors the state of sensors and their associated electronics. These various inputs must be validated 
and combined to produce a reliable picture of the current engine state. In order to test such an HMS, appropriate test 
signals must be generated to ensure that the overall supervisory, or fusion, algorithm correctly diagnoses an anomaly 
and takes the expected corrective action. The synchronization of these test signals must be such that it accurately 
reflects the real world manifestation of the fault under test. 

IV. A Solution 


A. Multi-Disciplinary Simulation

An obvious solution that comes to mind is the use of multi-disciplinary simulation. In other words, various 
types of models are integrated to provide the necessary fault characterization from the required fluid, structural, 
thermal, etc. viewpoint. For a complex system such as a rocket engine, this is a very difficult task. Turbulence, 
multi-phase flow, boundary layers, combustion mixing and burning, blade wakes, high heat flux and shock waves 
are just a few of the physical phenomena one has to contend with which, even with 3-D modeling techniques, are 
difficult to capture accurately. While it may be possible to create a high level simulation using finite element fluid, 
thermal and structural models, the time required to arrive at a solution would be prohibitive, at least with current 
processors, and certainly not practical for HMS development and test. Also, many high level models produce static 
solutions which are unsuitable for this type of simulation. 


B. Real-Time Requirement for V&V 

As stated above, highly detailed models are impractical for the purposes of HMS development. Based on our 
experience, thousands of individual runs must be performed to obtain the required data. Also, the most likely final 
step in the V&V of a health management system is an end-to-end test in a hardware-in-the-loop facility. This
ensures that the system performs its function in the most realistic setting possible. This was the preferred method of 
V&V for SSME and XRS-2200 (X-33) controllers and was, and still is, performed in Huntsville, Alabama 
simulation labs. This V&V requirement drives the requirement for real-time capable models but one must also 
consider the practicality of employing high-level models. 

C. Possible Ways to Achieve Real-Time 

There exist a number of techniques to achieve real-time simulation. One is to run non-real-time detailed 
simulations and play back the results in real-time. However, this introduces timing issues and severely limits the 
ability to alter the parameters of the experiment. A second option is to apply parallel computing techniques. RPP has 
explored this option for converting a detailed 0-D fluid system model to real-time and has found it to be viable with 
current processors. Another option is to extract the essence of highly detailed models, often single point or static, by 
performing a multitude of simulations throughout the envelope of operation and “stitching” together the results to 
achieve pseudo-continuous operation. The drawback is that this adds another layer of abstraction which typically 
reduces the accuracy of the result, especially as one moves further away from an anchor point. Also, dynamic 
response is questionable unless a high order transient model is used to generate the appropriate lags and delays. 
Another technique that can be used is to simplify higher order models. RPP has developed a family of real-time 
models (RTM) originally used for controller V&V in a hardware-in-the-loop environment. This original function has
been expanded to include control law development, trajectory analysis and HMS development and test. The 
drawback to this approach is that some model fidelity, both static and dynamic, is traded for speed. These models 
have also been employed in a new testbed or HMS laboratory that allows testing of an HMS for an engine which has
yet to be built. 
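
The "stitching" option can be illustrated with a short sketch: static solutions at a few anchor points are interpolated to give a pseudo-continuous steady-state map, and a first-order lag supplies a crude dynamic response. The anchor values, the time constant, and the choice of a first-order lag are assumptions made for illustration; as noted above, a higher-order transient model would be needed for credible dynamics.

```python
from bisect import bisect_right

# Hypothetical anchor points: (power level, steady-state chamber pressure).
ANCHORS = [(0.5, 50.0), (0.75, 78.0), (1.0, 100.0)]

def steady_state(power_level):
    """Linearly interpolate between the nearest static anchor solutions."""
    xs = [x for x, _ in ANCHORS]
    if power_level <= xs[0]:
        return ANCHORS[0][1]
    if power_level >= xs[-1]:
        return ANCHORS[-1][1]
    i = bisect_right(xs, power_level)
    (x0, y0), (x1, y1) = ANCHORS[i - 1], ANCHORS[i]
    return y0 + (y1 - y0) * (power_level - x0) / (x1 - x0)

def simulate(commands, dt, time_constant, initial):
    """Drive a first-order lag toward the interpolated steady-state value."""
    value, history = initial, []
    for power_level in commands:
        target = steady_state(power_level)
        value += (target - value) * dt / time_constant
        history.append(value)
    return history

# Step from 50% to 100% power, sampled every 20 ms for 4 s.
trace = simulate([1.0] * 200, dt=0.02, time_constant=0.5,
                 initial=steady_state(0.5))
```

Accuracy between anchors depends entirely on how well a straight line approximates the underlying static solutions, which is the loss of fidelity away from an anchor point described above.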

Real-time simulation capability provides a significant improvement in testability of real-time diagnostics and 
the complete HMS decision cycle through control response. Real-time diagnostic procedures must effectively 
address the challenging trade between HMS response speed and fidelity. Inverse models are a natural design phase 
development for real-time fault diagnostics. They work by inverting the engine performance model, that is, by 
determining component parameter values most consistent with observed sensible conditions. In this manner, 
abnormal sensible conditions can be traced directly to the anomaly initiating component(s). This provides a natural 
method for both fault detection and isolation. Inversion diagnostics can be constructed at a level of complexity 
consistent with the appropriate trade of response time and fault discrimination fidelity. Linear influence coefficient 
models occupy the highest speed, lowest fidelity end of the diagnostic capability spectrum and full performance 
model inversions that retain complete nonlinear process dynamics occupy the lowest speed, highest fidelity end of 
the capabilities trade space. A variety of performance model levels and characteristics can be implemented to 
explore/extend the trade space consistent with HMS requirements. 
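
At the fast, low-fidelity end of this spectrum, a linear influence coefficient inversion can be sketched in a few lines. The influence matrix, the two component parameters, and the detection threshold below are hypothetical values chosen for illustration; the sketch solves the overdetermined linear system by least squares and is not the Inverse Model implementation referenced in this paper.

```python
# Hypothetical influence coefficients: rows are sensors, columns are the
# fractional changes in (pump efficiency, turbine efficiency).
H = [
    [2.0, 0.5],   # discharge pressure
    [0.3, 1.5],   # turbine outlet temperature
    [1.0, 1.0],   # shaft speed
]

def estimate_parameter_shift(residuals):
    """Least-squares solution of H * dp = residuals via the normal equations."""
    a11 = sum(row[0] * row[0] for row in H)
    a12 = sum(row[0] * row[1] for row in H)
    a22 = sum(row[1] * row[1] for row in H)
    b1 = sum(row[0] * r for row, r in zip(H, residuals))
    b2 = sum(row[1] * r for row, r in zip(H, residuals))
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

def isolate(residuals, threshold=0.02):
    """Name the parameters whose estimated shift exceeds the threshold."""
    names = ("pump_efficiency", "turbine_efficiency")
    shifts = estimate_parameter_shift(residuals)
    return [n for n, s in zip(names, shifts) if abs(s) > threshold]

# Sensor residuals generated by a 5% drop in pump efficiency only.
true_shift = (-0.05, 0.0)
residuals = [row[0] * true_shift[0] + row[1] * true_shift[1] for row in H]
shifts = estimate_parameter_shift(residuals)
suspects = isolate(residuals)
```

Detection and isolation fall out of the same calculation: a nonzero estimated shift signals a fault, and the parameter carrying the shift points to the component.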


V. Rocketdyne’s HMS Laboratory 


A. Overview 

To meet a scheduled goal for the Next Generation Launch Technology (NGLT) program that called for a 
testbed demonstration, RPP and NASA-Glenn Research Center jointly developed a laboratory for the design, test,
and demonstration of a rocket engine health management system. The laboratory consists of six processors in five
computers that collectively simulate the operation, control and monitoring of the RS-84 oxygen-RP-1 booster engine
in real-time. Two processors in the same simulation computer (an Applied Dynamics RTS) simulate the engine
itself and its controller. Of the remaining four computers, one serves as host for the simulation computer and
allows for on-the-fly fault insertion, one serves as the command simulator, one displays real-time data traces and
one is the health management computer.


[Figure 2 block labels: 933 MHz PC; Real-Time WinPlot Display; Health Management Computer]

Figure 2. HMS Testbed Configuration
The laboratory is designed to mimic, as much as possible, the configuration and functions of a modern hot-fire
test or flight configuration. The controller executes vehicle commands and acquires sensor data. Data transfer
between the controller and the command and control computer uses a commonly used scaled-integer format, while all other
communication is accomplished using network hardware. Sensor redundancy is consistent with the latest list
provided by our Instrumentation Group. Noise levels are best estimates based on previous engine designs such as the 
SSME. Fault insertion is accomplished by altering engine parameters, such as turbine efficiencies, using direct 
memory insertion. 
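
For readers unfamiliar with scaled-integer data transfer, the following sketch shows one common form of the idea: a physical value within a known range is mapped onto an unsigned integer count and recovered on the receiving side. The 16-bit width and the 0 to 5000 range are assumptions for illustration; the word size and scaling used in the testbed are not specified here.

```python
def to_scaled_int(value, lo, hi, bits=16):
    """Map a physical value in [lo, hi] onto an unsigned integer count."""
    full_scale = (1 << bits) - 1
    clamped = min(max(value, lo), hi)  # saturate out-of-range readings
    return round((clamped - lo) / (hi - lo) * full_scale)

def from_scaled_int(count, lo, hi, bits=16):
    """Recover the physical value from an integer count."""
    full_scale = (1 << bits) - 1
    return lo + count / full_scale * (hi - lo)

# Hypothetical pressure channel spanning 0 to 5000 units.
count = to_scaled_int(2500.0, lo=0.0, hi=5000.0)
recovered = from_scaled_int(count, lo=0.0, hi=5000.0)
resolution = 5000.0 / ((1 << 16) - 1)
```

The round trip is exact only to within one count, so the channel range and word size together set the quantization noise seen by the HMS.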

The engine simulation is a simplified physics-based model that employs standard fluid flow, heat transfer and 
thermodynamic equations along with property tables and machine performance maps to accurately predict engine 
performance. This type of model is referred to as a real-time model (RTM) and was originally developed for 
hardware-in-the-loop V&V of engine controllers but has also recently been used for anomaly resolution, control law 
development, and trajectory analysis 3 . For this project, the nominal model was augmented with sensor noise and the 
ability to model expected build and test-to-test variability. The RTM has a nominal step time of 0.0005 sec for a 
refresh rate of 2000 Hz. The RTM responds to valve positions which are supplied by an external valve actuator 
simulation which itself responds to valve position commands from the controller simulation. 

The controller simulation serves to start and shut down the engine and maintains items such as thrust and
mixture ratio. Commands from the vehicle simulator are received and executed by the controller through valve 
position commands generated internally by control laws. The controller simulation is also responsible for acquiring 
engine data which is transmitted to the vehicle simulator. In the current testbed implementation, the controller 
cycles at a 50 Hz rate. 

The vehicle simulator issues commands to the controller. The commands are read from a text format test profile 
or the operator may issue commands on-the-fly. An engine display provides a continuous update of key engine 
parameters. Radio buttons on the display allow the operator to set inhibits on shutdown, throttle and mixture ratio 
change. The purpose of the inhibits is to select which mitigation actions received from the health management 
computer may be executed. This capability envisions that a future engine is part of a cluster which is managed by a 
higher level supervisory routine which sets or clears inhibits based on mission timeline, propellant consumption and 
the availability of other engines to take up the load for a malfunctioning engine. 
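
The inhibit logic amounts to filtering the mitigation actions recommended by the health management computer against the set of currently inhibited actions. A minimal sketch, with the enumeration and function names chosen here for illustration rather than taken from the testbed software:

```python
from enum import Enum

class Action(Enum):
    SHUTDOWN = "shutdown"
    THROTTLE = "throttle"
    MIXTURE_RATIO_CHANGE = "mixture_ratio_change"

def permitted_actions(recommended, inhibits):
    """Keep only the recommended actions that are not currently inhibited."""
    return [a for a in recommended if a not in inhibits]

# Example: shutdown is inhibited, so only the throttle change may execute.
inhibits = {Action.SHUTDOWN}
recommended = [Action.SHUTDOWN, Action.THROTTLE]
executed = permitted_actions(recommended, inhibits)
```

A supervisory routine of the kind envisioned above would simply update the `inhibits` set as the mission timeline and the state of the other engines change.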

The health management computer performs sensor validation, fault detection, diagnosis and mitigation. These 
functions are detailed below. 

The RTS has the only real-time operating system in the testbed. Synchronization is maintained by sending data
packets at a known interval (every 20 milliseconds) and using this data exchange to trigger algorithm updates. The
HMC runs two forked processes with only one keyed to the master data exchange clock. The diagnostic routine is
allowed to run freely, which produces somewhat inconsistent performance. Future updates to the testbed include
adding additional RTOSs to ensure determinism.
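
The timing relationship can be made concrete with a small sketch. With a 0.0005 s model step and a 20 ms data exchange, the model advances 40 steps between packets. The loop below counts time in integer microseconds to avoid floating-point drift; the function and callback names are illustrative and do not describe the actual RTS scheduler.

```python
RTM_STEP_US = 500           # 0.0005 s model step (2000 Hz)
PACKET_PERIOD_US = 20_000   # 20 ms data exchange (50 Hz)
STEPS_PER_PACKET = PACKET_PERIOD_US // RTM_STEP_US

def run(duration_us, on_packet):
    """Advance the model at its own rate; fire on_packet at each exchange."""
    model_steps = 0
    for now_us in range(RTM_STEP_US, duration_us + 1, RTM_STEP_US):
        model_steps += 1                  # one model integration step
        if now_us % PACKET_PERIOD_US == 0:
            on_packet(now_us)             # would trigger downstream updates
    return model_steps

packet_times = []
steps = run(1_000_000, packet_times.append)  # one second of simulated time
```

Any process that is not driven by `on_packet`, such as the free-running diagnostic routine, has no fixed relationship to this clock, which is the source of the inconsistent performance noted above.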

B. Concept 

The sensor validation algorithm uses standard techniques to detect abnormal sensor readings. The first is a 
limits check to verify that the sensor reading is within expected bounds. The second is a noise check to determine if 
the sensor is displaying too little or too much noise which is indicative of frozen pressure lines or electrical 
problems. 
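
Both checks are simple enough to sketch directly. The version below applies them to a short window of recent samples and uses the population standard deviation as the noise measure; the window length, limits, and noise bounds are hypothetical, and the actual algorithm may measure noise differently.

```python
from statistics import pstdev

def validate_sensor(samples, lo, hi, min_noise, max_noise):
    """Return a list of failed checks for a window of recent readings."""
    failures = []
    if any(s < lo or s > hi for s in samples):
        failures.append("limits")
    noise = pstdev(samples)
    if noise < min_noise:
        failures.append("noise_too_low")
    elif noise > max_noise:
        failures.append("noise_too_high")
    return failures

# Hypothetical bounds for one pressure channel.
BOUNDS = dict(lo=50.0, hi=150.0, min_noise=0.05, max_noise=5.0)

healthy = [100.2, 99.7, 100.4, 99.9, 100.1, 99.6]
frozen = [100.0] * 6
spiky = [100.0, 140.0, 60.0, 135.0, 70.0, 100.0]

results = {
    "healthy": validate_sensor(healthy, **BOUNDS),
    "frozen": validate_sensor(frozen, **BOUNDS),
    "spiky": validate_sensor(spiky, **BOUNDS),
}
```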

A total of 24 optimized sensor locations are continuously monitored by the HMC. Vehicle commands are also 
fed to the HMC where they are used to command an internal engine and controller simulation which serves as the 
truth model. This truth model is used to trigger the diagnostic algorithm if a discrepancy is detected. A
trigger mechanism is employed to minimize the possibility of false detections.
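
One way such a trigger might be structured is sketched below: the measured value is compared with the truth-model prediction, and the diagnostic algorithm is invoked only after the discrepancy has persisted for several consecutive frames. The persistence requirement, the threshold, and the sample values are assumptions added for illustration; the text above specifies only that a discrepancy triggers the diagnostics.

```python
def make_trigger(threshold, persistence):
    """Return a function that fires after `persistence` consecutive
    frames in which |measured - predicted| exceeds `threshold`."""
    count = 0

    def update(measured, predicted):
        nonlocal count
        count = count + 1 if abs(measured - predicted) > threshold else 0
        return count >= persistence

    return update

trigger = make_trigger(threshold=2.0, persistence=3)
predicted = 100.0  # truth-model output, held constant for this example
measurements = [100.5, 103.0, 100.2, 96.5, 96.0, 95.5, 95.0]
fired = [trigger(m, predicted) for m in measurements]
```

In this example the isolated excursion in the second frame does not fire the trigger, while the sustained offset beginning in the fourth frame does once it has lasted three frames.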

The diagnostic algorithm consists of two modules which together comprise the Inverse Model. The Inverse
Model is so named because it reverses the operation of a typical model. A typical model generates predictions of 
measurable parameters based on control inputs, boundary conditions and usually immeasurable hardware parameters 
such as pump efficiency. The Inverse Model accepts sensor inputs, control inputs and inlet conditions and estimates 
the hardware parameters that most likely produced them. Having nominal values for these parameters available, it is 
then a relatively easy task to detect changes. In addition, knowing which parameters have changed indicates which
component or components are faulty. The first module of the Inverse Model generates predictions of the engine state
based on control inputs and inlet conditions while the second module actually performs the diagnostic function. The 
Inverse Model is described in more detail in an associated paper 4 . 

In addition to the Inverse Model, a fast response algorithm is to be added which would take immediate action in 
the presence of a major fault which does not require diagnostics. Also, a fusion algorithm is envisioned that would 
poll HMS subsystems, act as a referee in case of conflicting inputs and select and transmit recommended corrective 
actions to the engine controller. 


VI. Future Needs 


A. Structural Sensors 

The concept presented is limited to what we refer to as performance data: temperatures, pressures, flowrates, 
and turbopump speeds. It can be expected that new systems will also monitor structural sensors. One
example is the Real-Time Vibration Monitoring System (RTVMS) which employs accelerometers located on key 
components of the engine such as the turbopumps 3 . For V&V, output from these sensors will also need to be
provided and/or generated. These signals may take the form of time varying voltage but the frequency content may
be computationally difficult to reproduce. A higher-level intermediate signal such as a power spectral density may 
be more appropriate until processor capability improves. 
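
To make the notion of a power spectral density as an intermediate signal concrete, the sketch below computes a one-sided periodogram of a synthetic accelerometer trace using a direct discrete Fourier transform. The sample rate, the 50 Hz tone, and the record length are hypothetical, and a practical system would use a fast Fourier transform with windowing and averaging rather than this direct form.

```python
import cmath
import math

def periodogram(signal, sample_rate):
    """One-sided power spectral density estimate via a direct DFT."""
    n = len(signal)
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        power = abs(coeff) ** 2 / (sample_rate * n)
        if 0 < k < n / 2:
            power *= 2.0  # fold in the matching negative-frequency bin
        freqs.append(k * sample_rate / n)
        psd.append(power)
    return freqs, psd

SAMPLE_RATE = 1000.0  # Hz, hypothetical
signal = [math.sin(2 * math.pi * 50.0 * i / SAMPLE_RATE) for i in range(200)]
freqs, psd = periodogram(signal, SAMPLE_RATE)
peak_hz = freqs[psd.index(max(psd))]
```

Supplying the HMS with spectra of this kind, rather than raw time-varying voltages, moves the expensive waveform synthesis out of the real-time loop.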

B. Other Types of Sensors 

In addition to flowpath and structural sensors, the HMS suite may also include cameras, spectral analyzers, 
sniffers, microphones and magnetic pickups. Each of these sensors produces a different type of signal that would 
have to be mimicked in a comprehensive V&V program. For example, infrared cameras would produce a pixel map 
with the color of each pixel indicating its temperature. Perhaps machine vision software would interpret the image as 
a local hot spot caused by unwanted burning or a leak of hot gas or cryogenic propellant. A plume spectroscopy 
system such as OPAD would monitor the exhaust plume and would produce spectral lines of varying amplitude
indicating the type and quantity of material being ejected 4 . This might make it possible to discern the source of the 
material and the rate of erosion. Multiple indications of the same fault would greatly aid in diagnosis and allow a 
corrective action to be taken with the utmost confidence. 

C. Simulation Capability To Accommodate New Sensors 

Work is currently underway to add structural simulation capability to the HMS laboratory. The general plan is 
to recreate the dynamic forces that cause engine vibration and validate the results with existing data. This task is 
much more difficult than simulating performance data due to the complexity involved. Eventually, the capability to 
reproduce other types of signals will be added. It is expected that this will be non-trivial. Recreating, for example, 
an infrared image of the engine will require having a thermal model along with diffusion models to recreate 
propellant leaks. It remains to be seen how accurately the models recreate the manifestation of a fault as observed by 
different sensors. It also remains to be seen how the models can be anchored with enough confidence to allow their 
results to be used for V&V of an undoubtedly flight critical system. 


VII. Conclusion 

Verification and validation of an HMS for a complex system such as a rocket engine will most likely require the
development of real-time multidisciplinary simulation capability. Alternative methods such as seeding faults, 
testing-to-failure and using historical data are either impractical or insufficient. A testbed or lab as presented here is 
a first step toward the type of simulation capability required, a capability which allows realistic simulations of faults 
and failures as observed by a variety of sensors. This goal will not be easy to achieve due to the complexity of the 
system. Current tools are also not configured for this type of real-time transient simulation. In addition, the ability to 
anchor or validate the results will be very challenging. However, an HMS is unlikely to be activated unless it has 
gone through rigorous testing to prove it safe, robust and effective. 